436 research outputs found

    Ecotourism as illustrated by environmental protection activities at the municipal level: Хостенин (Khostenin)

    The paper describes the settlement of Хостенин (Khostenin), where alternative energy sources are used with the support of municipal authorities. The settlement has become known for its ecological projects, which involve the use of local resources, conservation, and renewable energy sources, in particular solar energy and biomass, as well as environmentally safe technologies that have supported the sustainable development of the area since the mid-1990s.

    Matching Subsequences in Trees

    Given two rooted, labeled trees $P$ and $T$, the tree path subsequence problem is to determine which paths in $P$ are subsequences of which paths in $T$. Here a path begins at the root and ends at a leaf. In this paper we propose this problem as a useful query primitive for XML data, and provide new algorithms improving the previously best known time and space bounds.
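The problem statement above can be illustrated with a brute-force baseline. This is a minimal sketch, not the algorithm proposed in the paper: it enumerates every root-to-leaf path and tests each pair directly. The tree encoding as `(label, children)` tuples and all function names are assumptions made for this example.

```python
# Brute-force baseline for the tree path subsequence problem.
# Assumption: a tree is a (label, [children]) tuple; this is illustrative only.

def root_to_leaf_paths(tree):
    """Return every root-to-leaf path as a list of labels."""
    label, children = tree
    if not children:
        return [[label]]
    return [[label] + rest
            for child in children
            for rest in root_to_leaf_paths(child)]


def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting elements."""
    remaining = iter(long)
    # `x in remaining` advances the iterator, so order is respected.
    return all(x in remaining for x in short)


def matching_path_pairs(p_tree, t_tree):
    """Return (i, j) pairs where path i of P is a subsequence of path j of T."""
    p_paths = root_to_leaf_paths(p_tree)
    t_paths = root_to_leaf_paths(t_tree)
    return [(i, j)
            for i, p in enumerate(p_paths)
            for j, t in enumerate(t_paths)
            if is_subsequence(p, t)]


P = ("a", [("b", []), ("c", [])])
T = ("a", [("x", [("b", [])]), ("c", [("d", [])])])
print(matching_path_pairs(P, T))
```

This baseline materialises every path explicitly, so its cost grows with the product of the number of paths and their lengths. It is useful only as a reference for checking results from a faster method.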

    Supermetric search with the four-point property

    Metric indexing research is concerned with the efficient evaluation of queries in metric spaces. In general, a large space of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most such mechanisms rely upon the triangle inequality property of the metric governing the space. The triangle inequality property is equivalent to a finite embedding property, which states that any three points of the space can be isometrically embedded in two-dimensional Euclidean space. In this paper, we examine a class of semimetric space which is finitely 4-embeddable in three-dimensional Euclidean space. In mathematics this property has been extensively studied and is generally known as the four-point property. All spaces with the four-point property are metric spaces, but they also have some stronger geometric guarantees. We coin the term supermetric space as, in terms of metric search, they are significantly more tractable. We show some stronger geometric guarantees deriving from the four-point property which can be used in indexing to great effect, and show results for two of the SISAP benchmark searches that are substantially better than any previously published

    Accelerating Metric Filtering by Improving Bounds on Estimated Distances

    Filtering is a fundamental strategy of metric similarity indexes to minimise the number of computed distances. Given a triple of objects for which distances of two pairs are known, the lower and upper bounds on the third distance can be set as the difference and the sum of these two already known distances, due to the triangle inequality rule of the metric space. For efficiency reasons, the tightness of bounds is crucial, but as angles within triangles of distances can be arbitrary, the worst case with zero and straight angles must also be considered for correctness. However, in data of real-life applications, the distribution of possible angles is skewed and extremes are very unlikely to occur. In this paper, we enhance the existing definition of bounds on the unknown distance with information about possible angles within triangles. We show that two lower bounds and one upper bound on each distance exist in case of limited angles. We analyse their filtering power and confirm high improvements of efficiency by experiments on several real-life datasets
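The classical bounds described above can be sketched as follows, together with a simple angle-limited variant. The variant uses the law of cosines with an assumed range for the angle between the two known sides; it is an illustration of why limiting angles tightens the bounds, not the paper's own definition of its two lower bounds and one upper bound. The names `d_qp`, `d_po`, and `radius` are hypothetical and stand for query-to-pivot distance, pivot-to-object distance, and search radius.

```python
import math


def triangle_bounds(d_qp, d_po):
    """Classical bounds on the unknown third distance from the triangle inequality."""
    return abs(d_qp - d_po), d_qp + d_po


def angle_limited_bounds(d_qp, d_po, min_angle, max_angle):
    """Bounds on the third distance when the angle between the two known
    sides is assumed to lie in [min_angle, max_angle] (radians, within [0, pi]).

    Assumption for illustration: the third side follows the law of cosines,
    which increases monotonically with the angle on [0, pi].
    """
    def third_side(angle):
        squared = d_qp ** 2 + d_po ** 2 - 2 * d_qp * d_po * math.cos(angle)
        return math.sqrt(max(0.0, squared))  # guard against rounding below zero

    return third_side(min_angle), third_side(max_angle)


def can_skip(d_qp, d_po, radius):
    """True if the object cannot lie within `radius` of the query,
    so its distance to the query need not be computed."""
    lower, _ = triangle_bounds(d_qp, d_po)
    return lower > radius


print(triangle_bounds(5.0, 3.0))
print(angle_limited_bounds(5.0, 3.0, math.pi / 3, math.pi / 2))
```

With the full angle range of 0 to pi, the second function reduces to the classical difference and sum. Any narrower range yields a larger lower bound and a smaller upper bound, which is the effect that makes filtering more selective.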

    Using MILOS to Build a Multimedia Digital Library Application: The PhotoBook Experience

    The digital library field has recently been broadening its scope of applicability, and it is also continuously adapting to the frequent changes occurring in the internet society. Accordingly, digital libraries are gradually moving from a controlled environment accessible only to professionals and domain experts to environments accessible to casual users who want to exploit the potential offered by digital library technology. These new trends require, for instance, new search paradigms to be offered, new media content to be managed, and new description extraction techniques to be used. Building digital library applications, and effectively adapting them to new emerging trends, requires developing a platform that offers standard and powerful building blocks to support application developers. In this paper we discuss our experience of using MILOS, a multimedia content management system oriented to the construction of digital libraries, to build a demanding application dedicated to non-professional users. Specifically, we discuss the design and implementation of an on-line photo album (PhotoBook), which is a digital library application that allows people to manage their own photos, to share them with friends, and to make them publicly available and searchable. PhotoBook uses a complex internal metadata schema (MPEG-7) and allows users to simply express complex queries (combining similarity search and fielded search), enabling them to retrieve material of interest even if metadata are imprecise or missing.

    The Tree Inclusion Problem: In Linear Space and Faster

    Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995] presented the first polynomial time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when $P$ and $T$ have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the depth of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion problem can be solved in space $O(n_T)$ and time $O(\min(l_P n_T,\ l_P l_T \log \log n_T + n_T,\ \frac{n_P n_T}{\log n_T} + n_T \log n_T))$. This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck.

    Techniques for Complex Analysis of Contemporary Data

    Contemporary data objects are typically complex, semi-structured, or entirely unstructured. In addition, objects are related to one another and form a network. In such a situation, data analysis requires not only traditional attribute-based access but also access based on similarity, as well as data mining operations. Though tools for such operations do exist, each usually specialises in a single operation and is available only for specialised data structures supported by specific computer system environments. In contrast, advanced analyses are obtained by applying several elementary access operations, which in turn requires expert knowledge in multiple areas. In this paper, we propose a unification platform for various data analytical operators, specified as a general-purpose analytical system, ADAMiSS. An extensible set of data-mining and similarity-based operators over a common versatile data structure allows the recursive application of heterogeneous operations, thus allowing the definition of the complex analytical processes necessary to solve contemporary analytical tasks. As a proof of concept, we present results obtained by our prototype implementation on two real-world data collections: the Twitter Higgs boson and the Kosarak datasets.

    Detecting Advanced Network Threats Using a Similarity Search

    In this paper, we propose a novel approach for the detection of advanced network threats. We combine knowledge-based detections with similarity search techniques commonly utilized for automated image annotation. This unique combination could provide effective detection of common network anomalies together with their unknown variants. In addition, it approaches network data analysis in much the same way as a security analyst does. Our research is focused on understanding the similarity of anomalies in network traffic and their representation within complex behaviour patterns. This will lead to a proposal of a system for the real-time analysis of network data based on similarity. This goal should be achieved within a period of three years as a part of a PhD thesis.

    Socioeconomic indicators and ethnicity as determinants of regional mortality rates in Slovakia

    Regional differences in mortality might reflect socioeconomic and ethnic differences between regions. The present study examines the relationship between education, unemployment, income, Roma population and regional mortality in the Slovak Republic. Separately for males and females, data on standardised mortality in the Slovak population aged 20-64 years in the year 2002 were calculated for each of the 79 districts. Similarly, the proportions of respondents with tertiary education, unemployed status, Roma ethnicity and income data were calculated per district. A linear regression model was used to analyse the data. Socioeconomic differences in regional mortality were found among males, but not among females. While education and unemployment rate significantly contributed to mortality differences between regions, income and the proportion of Roma population did not. The model explained 32.9% of the variance in standardised mortality rate among districts for males and 7.6% for females. Low education and a high unemployment rate seem to be indicators of regions with high male mortality and therefore should be targeted by policy measures aimed at decreasing mortality in productive age.

    Reference point hyperplane trees

    Our context of interest is tree-structured exact search in metric spaces. We make the simple observation that the deeper a data item is within the tree, the higher the probability of that item being excluded from a search. Assuming a fixed and independent probability $p$ of any subtree being excluded at query time, the probability of an individual data item being accessed is $(1-p)^d$ for a node at depth $d$. In a balanced binary tree half of the data will be at the maximum depth of the tree, so this effect should be significant and observable. We test this hypothesis with two experiments on partition trees. First, we force a balance by adjusting the partition/exclusion criteria, and compare this with unbalanced trees where the mean data depth is greater. Second, we compare a generic hyperplane tree with a monotone hyperplane tree, where the mean depth is also greater. In both cases the tree with the greater mean data depth performs better in high-dimensional spaces. We then experiment with increasing the mean depth of nodes by using a small, fixed set of reference points to make exclusion decisions over the whole tree, so that almost all of the data resides at the maximum depth. Again this can be seen to reduce the overall cost of indexing. Furthermore, we observe that, having already calculated reference point distances for all data, a final filtering can be applied if the distance table is retained. This further reduces the number of distance calculations required, whilst retaining scalability. The final structure can in fact be viewed as a hybrid between a generic hyperplane tree and a LAESA search structure.
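The access-probability argument above can be worked through numerically. This is a minimal sketch under the abstract's stated assumption of a fixed, independent exclusion probability p per subtree. The further assumption that the tree is a complete binary tree with one data item per node, and the function names, are introduced here for illustration only.

```python
def access_probability(p, depth):
    """Probability that an item at the given depth is accessed, i.e. that
    none of the `depth` subtrees on its path from the root was excluded."""
    return (1.0 - p) ** depth


def expected_accessed_fraction(p, max_depth):
    """Expected fraction of items accessed in a complete binary tree.

    Assumption for illustration: one data item per node, with 2**k nodes
    at depth k for k = 0..max_depth.
    """
    total = sum(2 ** k for k in range(max_depth + 1))
    accessed = sum(2 ** k * access_probability(p, k)
                   for k in range(max_depth + 1))
    return accessed / total


for depth in (1, 5, 10):
    print(depth, access_probability(0.3, depth))
print(expected_accessed_fraction(0.3, 10))
```

Because the access probability decays geometrically with depth, moving data deeper into the tree lowers the expected number of items touched per query under this model, which is the effect the experiments set out to observe.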